
    Integrative disease classification based on cross-platform microarray data

    Background: Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.

    Results: In this study, we tested the feasibility of disease classification by integrating the large number of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank ratios across datasets. In addition, we systematically map textual annotations of datasets to concepts in the Unified Medical Language System (UMLS), permitting quantitative analysis of the phenotype "distance" between datasets and automated construction of disease classes. We design a new classification approach named ManiSVM, which integrates manifold data transformation with SVM learning to exploit the data properties. Using leave-one-dataset-out cross-validation, ManiSVM achieved an overall accuracy of 70.7% (68.6% precision and 76.9% recall), with many disease classes achieving accuracy above 80%.

    Conclusion: Our results not only demonstrated the feasibility of the integrated disease classification approach, but also showed that the classification accuracy increases with the number of homogeneous training datasets. Thus, the power of the integrative approach will increase with the continuous accumulation of microarray data in public repositories. Our study shows that automated disease diagnosis can be an important and promising application of the enormous amount of costly-to-generate, yet freely available, public microarray data.
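
The abstract does not spell out how the log-rank ratios are computed, so the following is only a minimal sketch of one plausible reading: rank genes within each sample, then take the log of the ratio between a gene's rank in a disease sample and its rank in a control sample from the same dataset. The function names, the base-2 logarithm, the single-control comparison, and the toy values are all assumptions made for illustration, not the authors' implementation.

```python
import math

def ranks(values):
    """Return 1-based ranks (1 = lowest value); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def log_rank_ratios(case_sample, control_sample):
    """Per-gene log2(rank in case / rank in control), computed within one dataset.

    Hypothetical helper: the choice of log2 and of a single control sample
    is an assumption, not taken from the abstract.
    """
    case_r = ranks(case_sample)
    ctrl_r = ranks(control_sample)
    return [math.log2(c / n) for c, n in zip(case_r, ctrl_r)]

# Two made-up platforms measuring the same 4 genes on very different scales.
platform_a = log_rank_ratios([5.0, 1.0, 3.0, 9.0], [2.0, 4.0, 3.0, 8.0])
platform_b = log_rank_ratios([500.0, 120.0, 310.0, 2000.0],
                             [200.0, 450.0, 300.0, 900.0])
```

Because only within-sample ranks enter the ratio, the two toy platforms yield identical vectors despite their different intensity scales, which is the kind of cross-dataset comparability the abstract describes.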

    Income Smoothing over the Business Cycle: Changes in Banks’ Coordinated Management of Provisions for Loan Losses and Loan Charge-offs from the Pre-1990 Bust to the 1990s Boom

    We provide evidence that banks smooth income by managing provisions for loan losses and loan charge-offs in a coordinated fashion that varies across the bust and boom phases of the business cycle and across homogeneous and heterogeneous loan types. In particular, during the 1990s boom, we predict and find that banks accelerated provisioning for loan losses and made this less obvious by accelerating loan charge-offs, especially for homogeneous loans, for which charge-offs are determined using number-of-days-past-due rules. We also provide evidence that the valuation implications of banks’ provisions for loan losses and loan charge-offs vary across the phases of the business cycle and loan types, reflecting the effect of these factors on banks’ income smoothing. In particular, during the 1990s boom, we predict and find that charge-offs of homogeneous loans have a positive association with current returns and future cash flows, because these charge-offs are recorded primarily by healthy banks with good future prospects that are reducing overstated allowances for loan losses. We also predict and find that these charge-offs have a positive association with future returns that is explained by their positive association with future net income and recoveries. Our results are consistent with the market only partially appreciating healthy banks’ overstatement of charge-offs of homogeneous loans based on number-of-days-past-due rules during the 1990s boom, because of the perceived non-discretionary nature of these charge-offs.

    REACTIN: Regulatory Activity Inference of Transcription Factors Underlying Human Diseases with Application to Breast Cancer

    Genetic alterations of transcription factors (TFs) have been implicated in the tumorigenesis of cancers. In many cancers, alteration of TFs results in their aberrant activity without changing their gene expression level. Gene expression data from microarray or RNA-seq experiments can capture the expression change of genes; however, it is still a challenge to reveal the activity change of TFs. Here we propose a method, called REACTIN (REgulatory ACTivity INference), which integrates TF binding data with gene expression data to identify TFs with significantly differential activity between disease and normal samples. REACTIN successfully detects differential activity of estrogen receptor (ER) between ER+ and ER- samples in 10 breast cancer datasets. When applied to compare tumor and normal breast samples, it reveals TFs that are critical for carcinogenesis of breast cancer. Moreover, REACTIN can be used to identify transcriptional programs that are predictive of survival time in breast cancer patients.